Fitting New Speakers Based on a Short Untranscribed Sample

Authors

  • Eliya Nachmani
  • Adam Polyak
  • Yaniv Taigman
  • Lior Wolf
Abstract

Learning-based text-to-speech (TTS) systems have the potential to generalize from one speaker to the next and thus to require only a relatively short sample of any new voice. However, this promise is currently largely unrealized. We present a method designed to capture a new speaker from a short untranscribed audio sample. This is done by employing an additional network that, given an audio sample, places the speaker in the embedding space. This network is trained as part of the speech synthesis system using various consistency losses. Our results demonstrate greatly improved performance both on the dataset speakers and, more importantly, when fitting new voices, even from very short samples.
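The abstract does not specify the embedding network's architecture or the exact form of its consistency losses, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a hypothetical `embed` function that averages feature frames over time and applies a fixed linear projection (standing in for a trained network), and a hypothetical `consistency_loss` that penalizes disagreement between the embeddings of two segments of the same speaker's audio. All names, dimensions, and the random data are assumptions made for illustration.

```python
import math
import random


def embed(frames, weights):
    """Map a variable-length list of feature frames to a fixed-size, unit-norm vector.

    Hypothetical stand-in for a trained speaker-embedding network.
    """
    dim_in = len(frames[0])
    # Average over time so the output size does not depend on sample length.
    mean = [sum(f[i] for f in frames) / len(frames) for i in range(dim_in)]
    # Linear projection into the embedding space.
    out = [sum(w * m for w, m in zip(row, mean)) for row in weights]
    # Normalize to unit length (guard against an all-zero vector).
    norm = math.sqrt(sum(v * v for v in out)) or 1.0
    return [v / norm for v in out]


def consistency_loss(e1, e2):
    """Squared Euclidean distance between two embeddings (assumed loss form)."""
    return sum((a - b) ** 2 for a, b in zip(e1, e2))


rng = random.Random(0)
# Assumed sizes: 4 input features per frame, 3-dimensional embedding.
weights = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(3)]
# Synthetic "audio features": 20 frames from one speaker.
sample = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(20)]

# Two segments of the same sample should land close together in the space.
first_half, second_half = sample[:10], sample[10:]
loss = consistency_loss(embed(first_half, weights), embed(second_half, weights))
```

In a real system the projection would be learned jointly with the synthesizer, and a loss of this kind would be one term among several; here it only shows how an untranscribed sample of any length can be reduced to a single point in a speaker space.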

Similar resources

Improving Speaker-Independent Lipreading with Domain-Adversarial Training

We present a Lipreading system, i.e. a speech recognition system using only visual features, which uses domain-adversarial training for speaker independence. Domain-adversarial training is integrated into the optimization of a lipreader based on a stack of feedforward and LSTM (Long Short-Term Memory) recurrent neural networks, yielding an end-to-end trainable system which only requires a very ...

Signal detection Using Rational Function Curve Fitting

In this manuscript, we propose a new scheme for communication signal detection that relies on the curve shape of the received signal and on the extraction of curve fitting (CF) features. This feature extraction technique is proposed for signal data classification at the receiver. The proposed scheme is based on curve fitting and approximation of rational fraction coefficients. For each symbo...

Investigation of Semi-Supervised Acoustic Model Training Based on the Committee of Heterogeneous Neural Networks

This paper investigates the semi-supervised training for deep neural network-based acoustic models (AM). In the conventional self-learning approach, a “seed-AM” is first trained by using a small transcribed data set. Then, a large untranscribed data set is decoded by using the seed-AM to create a transcription, which is finally used to train a new AM on the entire data. Our investigation in thi...

Bayesian Sample size Determination for Longitudinal Studies with Continuous Response using Marginal Models

Introduction: Longitudinal study designs are common in many areas of scientific research, especially in the medical, social, and economic sciences. The reason is that longitudinal studies allow researchers to measure changes in each individual over time and often have higher statistical power than cross-sectional studies. Choosing an appropriate sample size is a crucial step in a successful study. A st...

Modeling the Transport and Volumetric Properties of Solutions Containing Polymer and Electrolyte with New Model

A new theoretical model based on the local composition concept (TNRF-mNRTL model) was proposed to express the short-range contribution of the excess Gibbs energy for the solutions containing polymer and electrolyte. This contribution of interaction along with the long-range contribution of interaction (Pitzer-Debye-Hückel equation), configurational entropy of mixing (Flory-Huggins relation)...

Journal title:
  • CoRR

Volume: abs/1802.06984

Publication date: 2018